Enhanced Optical Character Recognition

1. The biggest problem every company faces when implementing an OCR-based information system is accuracy that falls short of the level its stakeholders expect.

2. An external cloud-based OCR processing service always takes an image file as input. The biggest constraint of this approach is that every image must have consistent quality and the same photographic conditions, so that text is read equally well across all images. In most real-world usage, this consistency is impossible.

3. Even after the OCR service has processed the image and returned the text it read, an automated validation process is still needed to ensure that the text is valid according to the document's format. More often than not, this validation returns false positives that negatively affect the whole system and its users.
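
To make the validation step in point 3 concrete, here is a minimal sketch of a format check on OCR output. The original does not state the document's format, so the five-digit reading, the "never lower than the previous reading" rule, and the name `validate_reading` are all assumptions made for illustration.

```python
import re

# Assumption: a meter reading is exactly five digits.
# The real document format is not given in the text.
READING_PATTERN = re.compile(r"\d{5}")


def validate_reading(ocr_text, previous_reading=None):
    """Return the reading as an int if it passes the checks, else None."""
    # Drop surrounding whitespace and stray spaces between digits.
    cleaned = ocr_text.strip().replace(" ", "")
    if not READING_PATTERN.fullmatch(cleaned):
        return None
    value = int(cleaned)
    # Assumption: a meter only counts upward, so a lower value is suspect.
    if previous_reading is not None and value < previous_reading:
        return None
    return value


print(validate_reading(" 01234 "))         # well-formed reading
print(validate_reading("0l234"))           # letter "l" misread for "1"
print(validate_reading("01234", 2000))     # lower than the previous reading
```

A check like this only confirms that the text has the right shape. A misread digit that still yields five digits passes it, which is one way false positives can slip through.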

The Enhanced OCR feature described here is part of our Water Company Information System project. The project aims to develop a system that helps a water company automate its operations through a comprehensive set of features, from water meter reading and billing to strategic managerial functions.

Standardized Photo Captures

We make sure that the user who provides input to the system follows the above guidelines for taking a proper photo of the object that will go through the OCR reading process.

Before the image is submitted, an automated cropping mechanism runs in the background to make sure that only the important part of the image is sent to the external cloud OCR service. This can dramatically reduce the number of errors as well as the need for extensive clean-up of foreign characters and lines.
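
The text does not describe how the cropping mechanism works. As a rough sketch of the general idea only, the function below crops a grayscale image to the bounding box of its dark pixels. The function name `crop_to_content`, the plain list-of-rows image representation, the threshold of 128, and the assumption of dark text on a light background are all hypothetical.

```python
def crop_to_content(pixels, threshold=128, margin=1):
    """Crop a grayscale image (list of rows, values 0-255) to its dark region.

    Assumes dark text on a light background; `margin` keeps a small border.
    """
    if not pixels:
        return pixels
    # Rows that contain at least one dark pixel.
    rows = [y for y, row in enumerate(pixels) if any(v < threshold for v in row)]
    if not rows:
        return pixels  # nothing dark found, so leave the image unchanged
    # Columns that contain at least one dark pixel.
    width = len(pixels[0])
    cols = [x for x in range(width) if any(row[x] < threshold for row in pixels)]
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + margin, len(pixels) - 1)
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + margin, width - 1)
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]


# A 5x6 white image with two dark pixels in the middle row.
image = [[255] * 6 for _ in range(5)]
image[2][2] = 0
image[2][3] = 0
cropped = crop_to_content(image)
print(len(cropped), len(cropped[0]))
```

A production system would more likely locate the meter display with an image library or a detection model; the bounding-box approach here only illustrates why sending a smaller region leaves fewer stray characters and lines for the OCR service to misread.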

Image Preprocessing Techniques

The following is the process that every image goes through before being sent to the cloud OCR service, so that the service can recognize the characters more easily.